This is part 1 of the two-part course from CDRC on using the UK Retail Centres dataset to create retail catchments. The video in this part introduces the Retail Centres dataset, and the practical session shows you how to work with it in R. If you are completely new to RStudio, please check out our Short Course on Using R as a GIS.
After completing this material, you will:
First we want to set up the libraries that we are going to be using for this practical. Most of them should be familiar (e.g. dplyr, sf, tmap). However, the hereR library will likely be new. Install it for now and we will come back to it later in the practical.
library(sf)
library(dplyr)
library(tmap)
# install.packages("hereR")
library(hereR)
Make sure you have set your working directory to wherever you have stored the CDRC-Retail-Catchment-Training folder we have provided:
#setwd("CDRC-Retail-Catchment-Training")
The first dataset we are going to be using is a geopackage containing the updated Retail Centre boundaries for the UK, which is within your data packs for this practical. We have given you a subset for the Liverpool City Region (LCR), containing the top 20 Retail Centres within the LCR.
All datasets you need for this practical are contained within the 'Data' subfolder within the CDRC-Retail-Catchment-Training folder we provided.
## Read in the Retail Centre Boundaries
rc <- st_read("Data/LCR_Retail_Centres_2018.gpkg")
## Reading layer `LiverpoolCR_Training2' from data source `C:\Users\sgpballa\Google Drive\Patrick Academic\POSTGRAD\CDRC Retail Project\Retail Training\CDRC-Retail-Catchment-Training\Data\LCR_Retail_Centres_2018.gpkg' using driver `GPKG'
## Simple feature collection with 20 features and 5 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 321011 ymin: 381441.1 xmax: 352342.2 ymax: 418208.7
## CRS: 27700
Let's take a look at the Retail Centre data:
## head() displays the first few rows of a dataframe/simple feature
head(rc)
## Simple feature collection with 6 features and 5 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 329579.5 ymin: 387801 xmax: 339819.7 ymax: 407399.8
## CRS: 27700
## rcID n.units n.comp.units rcName RetailPark
## 1 RC_EW_2713 189 44 Wallasey N
## 2 RC_EW_2720 76 25 Formby N
## 3 RC_EW_2723 75 28 Waterloo N
## 4 RC_EW_2746 140 27 Allerton Road N
## 5 RC_EW_2751 957 257 Liverpool City N
## 6 RC_EW_2756 308 92 Birkenhead N
## geom
## 1 POLYGON ((330413.8 391857.4...
## 2 POLYGON ((329908.9 407152.4...
## 3 POLYGON ((332200.5 398543.7...
## 4 POLYGON ((339115.6 388386, ...
## 5 POLYGON ((334054.8 390283.9...
## 6 POLYGON ((331231.6 388547.5...
So for each Retail Centre (polygon):
Let's map the Retail Centre polygons. Throughout this practical we will be using the tmap R package for all exploratory mapping - for more info on tmap visit: https://github.com/mtennekes/tmap.
In the next chunk of code I first set up the plotting mode I want to use throughout this practical (tmap_mode). The default option is "plot", which draws your layers as a static map on a plain background. Setting tmap_mode to "view" plots your layers over an interactive leaflet basemap instead. I then plot the Retail Centres by passing the rc object to tm_shape(), and use tm_fill() because the data are polygons rather than points or lines. Finally, I call tm_text() to label each polygon, taking the labels from the rcName column, which contains the names of the Retail Centres.
## Setup plotting
tmap_mode("view")
## Map Retail Centre Polygons for LCR
tm_shape(rc) +
tm_fill(col = "orange") +
tm_text("rcName", size = 0.75)
For this analysis we want to extract the centroids of the Retail Centres, which we will then construct catchments from. For those unfamiliar with centroids, a centroid is the central point (geometric centre) of a polygon, so it lets us represent each polygon as a single point. It is common practice to construct catchments from point data, so centroids are very helpful here. The sf package has a neat st_centroid() function for extracting centroids from polygons.
## Extract centroids
rc_cent <- st_centroid(rc)
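To make the idea of a centroid concrete, here is a minimal sketch on a made-up polygon rather than the Retail Centre data. The square, its coordinates and the object names (square, toy, toy_cent) are all hypothetical and are only there for illustration.

```r
library(sf)

# Hypothetical toy polygon: a 2 x 2 square with its lower-left corner at (0, 0)
square <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))

# Wrap it in a simple feature collection with one attribute column
toy <- st_sf(name = "Toy centre", geom = st_sfc(square, crs = 27700))

# st_centroid() returns one point per polygon and keeps the attribute columns
toy_cent <- st_centroid(toy)

# Coordinates of the centroid: the middle of the square
st_coordinates(toy_cent)
```

sf may warn that attributes are assumed to be constant over geometries, which is fine for this purpose. Bear in mind that for irregularly shaped polygons the centroid can fall outside the polygon itself; sf also provides st_point_on_surface() if you need a point that is guaranteed to lie inside.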
Let's map the Retail Centre Centroids:
## Map the centroids - note: tm_dots() is used as the object rc_cent contains point data (Retail Centre centroids)
tm_shape(rc_cent) +
tm_dots(col = "orange") +
tm_text("rcName", size = 0.75)
In this section we are going to build a three-tier hierarchy of Retail Centres. The hierarchy will be based on two things: the total number of comparison units in the centre (n.comp.units), and whether or not the Retail Centre is considered a major retail park.
In this next chunk of code I use the mutate function and case_when statements to generate a new column called 'hierarchy' where the Retail Centres are assigned as being 'primary', 'secondary' or 'tertiary' based on the number of comparison units in a centre (n.comp.units) and the value for Retail Park.
The conditions are set as follows:
Notice how I am using & and | in the case_when statements. In the first assignment (tertiary), a centre has to have fewer than 50 comparison units AND not be a Retail Park ("N"). In the second, secondary centres are either those with a comparison unit count between 50 and 99 OR those that are Retail Parks ("Y"); the | symbol expresses the OR condition.
## Use mutate and case_when to create the new column - notice how ~ assigns the new values based on the condition
rc_cent_nopipes <- mutate(rc_cent,
hierarchy = dplyr::case_when(n.comp.units < 50 & RetailPark == "N" ~ "tertiary",
(n.comp.units >= 50 & n.comp.units < 100) | RetailPark == "Y" ~ "secondary",
n.comp.units >= 100 ~ "primary"))
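case_when() checks its conditions in order and uses the first one that is TRUE for each row, so the order of the rules matters. The sketch below applies the same three rules to a small made-up data frame (the four centres and their values are hypothetical) to show which branch each kind of centre ends up in.

```r
library(dplyr)

# Hypothetical centres, chosen so that each rule is exercised
toy <- data.frame(
  rcName       = c("Small centre", "Mid centre", "Large centre", "Large retail park"),
  n.comp.units = c(20, 75, 150, 120),
  RetailPark   = c("N", "N", "N", "Y")
)

# Same conditions as in the practical, evaluated top to bottom
toy <- mutate(toy,
  hierarchy = case_when(
    n.comp.units < 50 & RetailPark == "N" ~ "tertiary",
    (n.comp.units >= 50 & n.comp.units < 100) | RetailPark == "Y" ~ "secondary",
    n.comp.units >= 100 ~ "primary"
  )
)

toy[, c("rcName", "hierarchy")]
```

Note that the large retail park is classed as secondary even though it has more than 100 comparison units, because the Retail Park rule is matched before the primary rule is reached. Any row that matched none of the conditions would be given NA.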
I then want to extract only certain columns from the data, which I can achieve using the select() function from dplyr:
## Use select to extract the rcID, rcName, n.units, n.comp.units, RetailPark and hierarchy columns
rc_cent_nopipes <- select(rc_cent_nopipes, rcID, rcName, n.units, n.comp.units, RetailPark, hierarchy)
However, there is an alternative to writing two separate chunks of code to create the hierarchy and then extract the columns we want. You may have seen pipes before - %>% - as they are commonly used to chain dplyr functions together. They work by taking the output from one function (e.g. the creation of the hierarchy column with mutate) and feeding it into the first argument of the next function (e.g. the selection of columns with select). I'll show you how you can do it below:
## Use pipes to create the new hierarchy column and then select the other columns we are interested in
rc_cent <- rc_cent %>%
mutate(hierarchy = dplyr::case_when(n.comp.units < 50 & RetailPark == "N" ~ "tertiary",
(n.comp.units >= 50 & n.comp.units < 100) | RetailPark == "Y" ~ "secondary",
n.comp.units >= 100 ~ "primary")) %>%
select(rcID, rcName, n.units, n.comp.units, RetailPark, hierarchy)
Notice how when piping you don't need to tell the mutate() and select() functions what object you want to perform that operation on, as it is piping directly from rc_cent. For clarity, let's compare the output with and without pipes:
## Without pipes
head(rc_cent_nopipes)
## Simple feature collection with 6 features and 6 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 329796 ymin: 388345.3 xmax: 339351.7 ymax: 407204.2
## CRS: 27700
## rcID rcName n.units n.comp.units RetailPark hierarchy
## 1 RC_EW_2713 Wallasey 189 44 N tertiary
## 2 RC_EW_2720 Formby 76 25 N tertiary
## 3 RC_EW_2723 Waterloo 75 28 N tertiary
## 4 RC_EW_2746 Allerton Road 140 27 N tertiary
## 5 RC_EW_2751 Liverpool City 957 257 N primary
## 6 RC_EW_2756 Birkenhead 308 92 N secondary
## geom
## 1 POINT (330617 391938.6)
## 2 POINT (329796 407204.2)
## 3 POINT (332068 398533.3)
## 4 POINT (339351.7 388345.3)
## 5 POINT (334655.1 390278.6)
## 6 POINT (331913.4 388629.2)
## With pipes
head(rc_cent)
## Simple feature collection with 6 features and 6 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 329796 ymin: 388345.3 xmax: 339351.7 ymax: 407204.2
## CRS: 27700
## rcID rcName n.units n.comp.units RetailPark hierarchy
## 1 RC_EW_2713 Wallasey 189 44 N tertiary
## 2 RC_EW_2720 Formby 76 25 N tertiary
## 3 RC_EW_2723 Waterloo 75 28 N tertiary
## 4 RC_EW_2746 Allerton Road 140 27 N tertiary
## 5 RC_EW_2751 Liverpool City 957 257 N primary
## 6 RC_EW_2756 Birkenhead 308 92 N secondary
## geom
## 1 POINT (330617 391938.6)
## 2 POINT (329796 407204.2)
## 3 POINT (332068 398533.3)
## 4 POINT (339351.7 388345.3)
## 5 POINT (334655.1 390278.6)
## 6 POINT (331913.4 388629.2)
The two outputs are identical. Whether you choose to use pipes is up to you - they can really speed up data transformation, but at the same time make it harder to untangle some broken code. For more information and tutorials on piping and how to use it with different dplyr functions, please visit: https://seananderson.ca/2014/09/13/dplyr-intro/
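If you would rather check this programmatically than compare printed output by eye, a minimal sketch is shown below. It uses a small made-up data frame (toy, with hypothetical columns id, n and extra) instead of the Retail Centre data, and runs the same mutate-then-select steps with and without pipes.

```r
library(dplyr)

# Hypothetical toy data
toy <- data.frame(id = c("a", "b", "c"), n = c(10, 60, 120), extra = 1:3)

# Without pipes: each step is written out and stored separately
step1 <- mutate(toy, big = n >= 50)
no_pipes <- select(step1, id, big)

# With pipes: the output of each step becomes the first argument of the next
with_pipes <- toy %>%
  mutate(big = n >= 50) %>%
  select(id, big)

# all.equal() returns TRUE when the two results match
all.equal(no_pipes, with_pipes)
```

The same kind of check could be run on rc_cent_nopipes and rc_cent from the practical.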
OK, so from the rc_cent object you can see that each Retail Centre has now been assigned a hierarchical category. We will use this throughout the rest of the practical to determine catchments that differ based on the hierarchical position of each centre.
Fixed-Ring Buffers are the simplest catchments. They involve drawing a circular buffer around a store (or Retail Centre) at a fixed distance, which represents the distance consumers are expected to be willing to travel. Fixed-Ring Buffers can be easily delineated in R using the st_buffer() function: you give it a simple features point object and a distance (in the units of the object's projection, metres in this case), and it returns a buffer for each individual point.
## Extract a 1000m buffer for each Retail Centre
buffer1km <- st_buffer(rc_cent, 1000)
Let's map these to see what they look like. Note: I am setting alpha to 0.3 so that overlapping buffers are clearly visible.
## Map the buffers
tm_shape(buffer1km) + ## Plot the buffers
tm_fill(col = "orange", alpha = 0.3) +
tm_shape(rc_cent) + ## Overlay the centroids
tm_dots(col = "orange")
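As a quick sanity check on what st_buffer() produces, the sketch below buffers a single made-up point (the coordinates are hypothetical, but use the same CRS as the practical data, EPSG:27700, whose units are metres) and compares the result with an exact circle of radius 1000m.

```r
library(sf)

# Hypothetical point in British National Grid coordinates (EPSG:27700, metres)
pt <- st_sfc(st_point(c(334000, 390000)), crs = 27700)

# 1000m fixed-ring buffer around the point
buf <- st_buffer(pt, dist = 1000)

# Width of the buffer's bounding box: twice the radius
bb <- st_bbox(buf)
unname(bb["xmax"] - bb["xmin"])

# Area of the buffer compared with the area of an exact circle
buf_area <- as.numeric(st_area(buf))
circle_area <- pi * 1000^2
buf_area / circle_area
```

The ratio comes out very slightly below 1 because the buffer is a polygon with many short edges approximating a circle, not a true circle. If your data were stored in a CRS with different units, the distance passed to st_buffer() would need to be expressed in those units instead.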
However, it is more informative to have buffers that vary in size depending on the hierarchical position of the Retail Centres. In this next chunk of code I have created a small function to do this. Functions are really helpful tools that enable you to apply a series of methods in sequence and return the output. This is particularly helpful when you need to repeat steps over and over again, as it removes the need for typing out the same lines of code multiple times. For those new to functions, watch the short video below: